Statistical Mechanics of Semi-Supervised Clustering in Sparse Graphs (Abstract)

Authors

  • Greg Ver Steeg
  • Aram Galstyan
  • Armen E. Allahverdyan
Abstract

We develop a statistical-mechanics-based approach for studying semi-supervised clustering in graphs in the presence of must-link and cannot-link constraints. We focus on bi-cluster graphs and study the impact of semi-supervision by varying the constraint density and the overlap between the clusters. Recent results for unsupervised clustering in sparse graphs indicate that there is a critical ratio of within-cluster to between-cluster connectivities below which clusters cannot be recovered with better than random accuracy. The goal of this paper is to examine how this criticality is affected by the presence of pairwise constraints.
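The setting described in the abstract can be illustrated with a minimal sketch, which is not the authors' model or code. It samples a sparse two-cluster graph and then draws must-link and cannot-link pairs from the true labels. The function names, the edge probabilities `c_in / n` and `c_out / n`, and the `density` parameter (expected constraints per node) are all assumptions made for illustration; the abstract does not specify how the graph or the constraints are generated.

```python
import random
from itertools import combinations


def planted_bicluster_graph(n, c_in, c_out, seed=0):
    """Sample a sparse two-cluster graph (illustrative assumption).

    Nodes 0..n-1; the first half is cluster 0, the second half cluster 1.
    Same-cluster pairs are linked with probability c_in / n and
    cross-cluster pairs with probability c_out / n, so the average
    degree stays bounded as n grows.
    """
    rng = random.Random(seed)
    labels = [0 if i < n // 2 else 1 for i in range(n)]
    edges = []
    for i, j in combinations(range(n), 2):
        p = (c_in if labels[i] == labels[j] else c_out) / n
        if rng.random() < p:
            edges.append((i, j))
    return labels, edges


def sample_constraints(labels, density, seed=0):
    """Draw pairwise constraints from the true labels.

    `density` is the expected number of constraints per node
    (a hypothetical parameterization). A pair with equal labels becomes
    a must-link, a pair with different labels a cannot-link.
    """
    rng = random.Random(seed)
    n = len(labels)
    must, cannot = [], []
    for _ in range(int(density * n)):
        i, j = rng.sample(range(n), 2)  # two distinct nodes
        (must if labels[i] == labels[j] else cannot).append((i, j))
    return must, cannot


labels, edges = planted_bicluster_graph(n=200, c_in=6.0, c_out=2.0)
must, cannot = sample_constraints(labels, density=0.1)
```

Lowering `c_in` toward `c_out` makes the two clusters overlap more, while raising `density` adds more supervision; these are the two quantities the abstract says are varied.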

Similar resources

Statistical Mechanics of Semi-Supervised Clustering in Sparse Graphs

We theoretically study semi-supervised clustering in sparse graphs in the presence of pairwise constraints on the cluster assignments of nodes. We focus on bi-cluster graphs, and study the impact of semi-supervision for varying constraint density and overlap between the clusters. Recent results for unsupervised clustering in sparse graphs indicate that there is a critical ratio of within-clust...

Sparse Representation on Graphs by Tight Wavelet Frames and Applications

In this paper, we introduce a unified theory of tight wavelet frames on non-flat domains in both continuum setting, i.e. on manifolds, and discrete setting, i.e. on graphs; discuss how fast tight wavelet frame transforms can be computed and how they can be effectively used to process graph data. We start from defining multiresolution analysis (MRA) generated by a single generator on manifolds, ...

The Graduate School SEMI-SUPERVISED CLUSTERING FOR HIGH-DIMENSIONAL AND SPARSE FEATURES

Clustering is one of the most common data mining tasks, used frequently for data organization and analysis in various application domains. Traditional machine learning approaches to clustering are fully automated and unsupervised where class labels are unknown a priori. In real application domains, however, some “weak” form of side information about the domain or data sets can be often availabl...

Learning With ℓ1-Graph for Image Analysis

The graph construction procedure essentially determines the potentials of those graph-oriented learning algorithms for image analysis. In this paper, we propose a process to build the so-called directed l1-graph, in which the vertices involve all the samples and the ingoing edge weights to each vertex describe its l1-norm driven reconstruction from the remaining samples and the noise. Then, a s...

Semi-supervised Hierarchical Clustering Analysis for High Dimensional Data

In many data mining tasks, there is a large supply of unlabeled data but limited labeled data, since labeled data is expensive to generate. Therefore, a number of semi-supervised clustering algorithms have been proposed, but few of them are specially designed for high dimensional data. High dimensionality is a difficult challenge for clustering analysis due to the inherent sparse distribution, and most of p...

Publication date: 2011